Tagger: Deep Unsupervised Perceptual Grouping

Neural Information Processing Systems

We present a framework for efficient perceptual inference that explicitly reasons about the segmentation of its inputs and features. Rather than being trained for any specific segmentation, our framework learns the grouping process in an unsupervised manner or alongside any supervised task. We enable a neural network to group the representations of different objects in an iterative manner through a differentiable mechanism. We achieve very fast convergence by allowing the system to amortize the joint iterative inference of the groupings and their representations. In contrast to many other recently proposed methods for addressing multi-object scenes, our system does not assume the inputs to be images and can therefore directly handle other modalities. We evaluate our method on multi-digit classification of very cluttered images that require texture segmentation. Remarkably, our method achieves improved classification performance over convolutional networks despite being fully connected, by making use of the grouping mechanism. Furthermore, we observe that our system greatly improves upon the semi-supervised result of a baseline Ladder network on our dataset. These results are evidence that grouping is a powerful tool that can help to improve sample efficiency.


Reviews: Tagger: Deep Unsupervised Perceptual Grouping

Neural Information Processing Systems

UPDATE: I thank the authors for their convincing rebuttal. In view of the promised updates to the technical specifications and the description of the method, I have increased the scores for "Technical quality" and "Clarity and presentation". The only major concern I still have is the lack of a suitable baseline to compare with. In particular, I do not agree that a comparison to [1] is impossible without their code. Instead, I'd encourage the authors to evaluate their method on the multi-MNIST benchmark described in Figure 1 of [1], and to simply use the numbers reported in [1] for comparison without re-simulation. This would significantly strengthen the results.

Unfortunately, however, I see two major flaws with the current presentation of the material:

** Literature and comparison to competitors

First, the literature on this topic seems not to be suitably accounted for.


Tagger: Deep Unsupervised Perceptual Grouping

Greff, Klaus, Rasmus, Antti, Berglund, Mathias, Hao, Tele, Valpola, Harri, Schmidhuber, Jürgen

Neural Information Processing Systems
